- --==| UNHTML v1.3 |==--
-
- (C)opyright 1996 by Jawed Karim <kari0022@gold.tc.umn.edu>
-
-
-
- What's New
- ==========
-
- UNHTML 1.3 has several improvements over 1.0:
-
-
- o The output files contain fewer empty lines, which
- reduces their size.
-
- o An ELF executable for Linux is included.
-
- o An editor can be launched after completion to
- manually edit the output file.
-
- o UNHTML counts how many HTML tags were removed.
-
- o Special character symbols '&' and ';' no longer
- cause trouble within '<' and '>'.
-
-
- Instructions
- ============
-
- unhtml v1.3 -- Removes HTML code from ascii files.
- (C)opyright 1996 by Jawed Karim <kari0022@gold.tc.umn.edu>
-
- syntax: unhtml <inputfile> <outputfile>
-
-
- <inputfile> : The file that contains HTML code.
-
- <outputfile>: After removing the HTML code, the text
- will be written to this file.
-
-
- EXAMPLE: unhtml index.html index.txt
-
- This removes any HTML code from index.html and writes the plain
- text to the file index.txt.
-
- After completion, the following message will be displayed:
-
- ---------- Done. Removed 110 HTML tags ----------
-
- edit index.txt manually [y] ?
-
- If you would like to edit the output file manually with a text
- editor, press 'y' at this point. If not, just hit enter. When you
- press 'y', UNHTML runs an editor command that depends on the
- system you are using:
-
- under Linux: command 'pico' will be executed
- under MSDOS: command 'edit' will be executed
- under OS/2 : command 'tedit' will be executed
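-
- As a rough illustration of the per-system editor choice listed
- above, the following C sketch picks the command name at compile
- time and builds the command line. It is not taken from the UNHTML
- source: the function names build_editor_command and launch_editor
- are hypothetical, and the predefined macros used to detect the
- platform are an assumption about the compilers involved.
-
```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: platform detection via compiler-predefined macros.
 * Which macros the real UNHTML checks (if any) is not documented. */
#if defined(__MSDOS__)
#define EDITOR_CMD "edit"   /* MSDOS */
#elif defined(__EMX__) || defined(__OS2__)
#define EDITOR_CMD "tedit"  /* OS/2 */
#else
#define EDITOR_CMD "pico"   /* Linux and anything else */
#endif

/* Writes "<editor> <file>" into buf.
 * Returns 0 on success, -1 if the command does not fit in buf. */
int build_editor_command(char *buf, size_t size, const char *file)
{
    int n;

    assert(buf != NULL && file != NULL);
    n = snprintf(buf, size, "%s %s", EDITOR_CMD, file);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Hands the command to the system's command processor.
 * Returns -1 if the command could not be built, otherwise
 * whatever system() returns. */
int launch_editor(const char *file)
{
    char cmd[512];

    if (build_editor_command(cmd, sizeof cmd, file) != 0)
        return -1;
    return system(cmd);
}
```
-
- Because the command is looked up by name, a batch file or link
- called 'edit', 'tedit' or 'pico' found through PATH would be picked
- up in place of the real editor. The sketch does not quote the file
- name, so names containing spaces would need extra handling.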
-
- Should you get an error message under MSDOS or OS/2, create a
- batch file that points to an editor. The following is an example
- of a DOS batch file:
-
- ---CUT HERE---
- c:\dos\edit %1
- ---CUT HERE---
-
- Save this file as 'EDIT.BAT' in the same directory as UNHTML, or
- in any directory listed in your PATH variable.
-
- Likewise, an OS/2 batch file would look like this:
-
- ---CUT HERE---
- c:\os2\tedit.exe %1
- ---CUT HERE---
-
- Save this file as 'TEDIT.CMD' in the same directory as UNHTML, or
- in any directory listed in your PATH variable.
-
- Under Linux, if you get an error message, make a symbolic link
- that points to whichever editor you use. Name the link 'pico'.
- For more help, see: man ln
-
-
- OS/2 Warp
- =========
-
- Compiler used: OS/2 EMX GCC v2.7.2
-
- This executable requires the EMX Runtime v0.9b or higher. It is
- available at:
-
- ftp://hobbes.nmsu.edu/os2/unix/emx09b/emxrt.zip
-
- This is worth getting since you will be able to use long filenames with
- UNHTML for OS/2.
-
-
- Linux
- =====
-
- Compiler used: GNU GCC v2.7.0
-
- This ELF executable has been tested under Linux 1.2.13.
-
-
- MSDOS
- =====
-
- Compiler used: djgpp GCC v2.6.3
-
- Unless you are running UNHTML for MSDOS in a DOS window under OS/2 or
- Windows (95/3.1/NT), the file CWSDPMI.EXE must be in a directory listed
- in your PATH variable, or in the same directory as UNHTML.
-
-
- Known Problems
- ==============
-
- Right now, UNHTML assumes that every '&' or '<' character starts HTML
- code, which ends at the next ';' or '>'. The one exception is an '&' or
- ';' that appears between '<' and '>', which is treated as part of the
- tag. Any of these characters that is not part of HTML code (for example,
- a literal '<' in ordinary text) may therefore cause problems.
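-
- The rule described above can be sketched as a small state machine.
- This is only an illustration of the stated assumptions, not the
- actual UNHTML source: the function name strip_html is hypothetical,
- and counting only '<...>' sequences (not '&...;' sequences) as
- removed tags is an assumption.
-
```c
#include <assert.h>
#include <stddef.h>

/* Copies src to dst without HTML code, using the rule stated above:
 * '<' starts a tag that ends at '>', '&' starts a sequence that ends
 * at ';', and '&' or ';' inside a tag are ignored.
 * dst must have room for at least strlen(src) + 1 bytes.
 * Returns the number of '<...>' tags removed. */
size_t strip_html(const char *src, char *dst)
{
    enum { TEXT, TAG, ENTITY } state = TEXT;
    size_t tags = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; ++src) {
        switch (state) {
        case TEXT:
            if (*src == '<')
                state = TAG;        /* skip until the closing '>' */
            else if (*src == '&')
                state = ENTITY;     /* skip until the closing ';' */
            else
                *dst++ = *src;      /* ordinary text is kept */
            break;
        case TAG:
            /* '&' and ';' carry no special meaning inside a tag. */
            if (*src == '>') {
                state = TEXT;
                ++tags;
            }
            break;
        case ENTITY:
            if (*src == ';')
                state = TEXT;
            break;
        }
    }
    *dst = '\0';
    return tags;
}
```
-
- Under these rules the input "1 < 2 and 3 > 2" loses everything
- between '<' and '>' and is counted as one removed tag, which is the
- kind of problem this section warns about.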
-
-
- Where to find updates
- =====================
-
- New UNHTML versions will be posted on:
-
- http://umn.edu/~kari0022
-
- or search for "Jawed Karim" on Yahoo! (http://www.yahoo.com)
-
- or email Jawed Karim at:
-
- Jawed.Karim-1@umn.edu
- kari0022@gold.tc.umn.edu
-
- -----BEGIN PGP PUBLIC KEY BLOCK-----
- Version: 2.6.2
-
- mQBtAzAEEsYAAAEDAKkXRZuRhuJ919uqvT4jzBRNw5Xi6+N5uH3QIoyPR1qeA3NW
- 60ji+3Yo2lOewzKrw0z8Aon5KsCfR/dAYJKpWIbQCI9WEedArFRxP48ClsHneWB9
- VYmMQnpu4PUi2KOHDQAFEbQmSmF3ZWQgS2FyaW0gPGthcmkwMDIyQGdvbGQudGMu
- dW1uLmVkdT4=
- =O8+H
- -----END PGP PUBLIC KEY BLOCK-----
-
-